Embedded standard products reduce the overall design load on the system design engineer and allow the creation of complex designs in the shortest possible time.
By Wai-Leng Lim
A new generation of system-level programmable products, known variously as embedded standard products (ESPs), application specific programmable products (ASPPs), or field programmable system chips (FPSCs), integrates embedded functions with programmable logic and memory on a single device. This approach enables design engineers to develop extremely sophisticated yet cost-effective systems in the shortest possible time. Because ESP devices are so highly integrated, however, engineers face a new set of design challenges: configuring the embedded functions, integrating them with their own custom-designed logic, running functional and timing simulation, and testing and debugging the hardware.
Figure 1 - Keeping it simple
ESPs include three basic components: an embedded standard function, a general purpose programmable logic array, and embedded memory.
In their simplest form, ESPs include three basic components: an embedded standard function, a general-purpose programmable logic array, and embedded memory (see Figure 1). The embedded functions in ESP devices are typically implemented using standard-cell or full-custom design approaches, so they generally offer higher performance and more features at a lower cost than the same function implemented in programmable logic. They can also offer specialized analog functions that are impossible to implement in programmable logic. A further advantage is that ESP device manufacturers rigorously test the functionality and timing of the embedded functions, essentially pre-verifying what will ultimately become a substantial part of the final design.
Addressing flexibility
Although the embedded functions in some ESP devices have a degree of customizability, their obvious potential downside is sacrificing the level of flexibility offered by general-purpose programmable logic. ESP devices make up for this lack of flexibility by providing general-purpose programmable logic on the same device. This programmable logic allows design engineers to fully customize part of their design while maintaining the benefits of embedded functions. Thus ESPs combine the high performance, guaranteed functionality, and low cost of standard products with the flexibility of programmable logic.
The design flow for ESPs is similar to the design flow for high-density general-purpose programmable logic devices, such as large FPGAs, yet differs in a few critical ways (see Figure 2). This flow includes design entry, functional simulation, logic synthesis, place and route, timing simulation, and hardware verification. We'll examine each of those steps in more detail and explain how they are unique for ESP-based designs.
Figure 2 - Distinguishing the differences
The design flow for ESPs includes design entry, functional simulation, logic synthesis, place and route, timing simulation, and hardware verification.
There are several techniques available to the ASIC or FPGA designer to help manage the complexity inherent in large designs. These include modular design, hierarchical design, and design reuse. For ESP-based designs, these approaches are required, so the first step in developing an ESP design is to partition it into logical sub-functions. At the highest level of the hierarchy, there will be two blocks: the embedded function and the user's custom logic. In effect, the embedded function becomes a "black box" instantiated block in the programmable logic design.
The next step is to define the functionality for the programmable logic side of the design all the way down to the lowest level functional blocks. Then each block is drawn or coded, verified, and added to the design. The on-chip memory (typically dual-port SRAM) available in ESPs is generally also integrated into the design as instantiated blocks at any level of the hierarchy.
Once the design has been partitioned and the various modules coded, the next step is to configure the embedded function. This step may be as simple as clicking your way through an embedded function software "wizard" user interface, or as complex as developing your own soft IP to run on the embedded function hardware.
Simulation, synthesis, and P&R
Finally, the embedded function is integrated with the custom logic at the highest level of the design hierarchy and the designer is ready for functional simulation.
Functional simulation for the entire design should be straightforward if the programmable logic portion of the design has been carefully constructed, since the embedded function has already been functionally characterized and tested by the ESP device manufacturer.
As with most third party supplied ASIC or FPGA soft IP, the embedded functions in ESP devices are typically provided with software simulation models and test benches. Thus, even though the embedded functions are black boxes, their functionality can easily be evaluated during simulation phases of the design.
Once the design engineer is satisfied that the design is functioning properly, the logic synthesis process can begin. This step is almost exactly the same as for an ASIC or FPGA design, except that synthesis takes less time because none is needed for the embedded function, which typically represents the largest portion of the design (by gate count).
While the place-and-route process for ESPs is also similar to that for FPGAs, the different architecture and even the technology used to build ESPs can change things somewhat. For example, no placement-and-routing process is necessary for the embedded function. Since the embedded function can represent as much as 90 percent of the total gates in an ESP-based design, the overall time for the place-and-route process can be much shorter than for an equivalent-sized design implemented completely in general purpose programmable logic.
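The arithmetic behind that claim can be sketched in a few lines. The function name, the 500,000-gate design size, and the premise that the embedded function needs no processing at all by the place-and-route tools are illustrative assumptions, not vendor data.

```python
def gates_needing_place_and_route(total_gates, embedded_fraction):
    """Return the number of gates left for the place-and-route tools.

    Assumes (as a simplification) that the embedded function arrives
    pre-placed and pre-routed, so only programmable-logic gates remain.
    """
    if not 0.0 <= embedded_fraction <= 1.0:
        raise ValueError("embedded_fraction must be between 0 and 1")
    return round(total_gates * (1.0 - embedded_fraction))


# Hypothetical 500,000-gate design with a 90 percent embedded share.
remaining = gates_needing_place_and_route(500_000, 0.90)
print(remaining)  # 50000
```

How much run time this saves in practice depends on the tools and the device, since place-and-route effort does not necessarily scale linearly with gate count.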
Design engineers do have to run place-and-route tools to implement the programmable logic portion of their design, however. One aspect of this process that is unique to ESPs is managing signal connections between the embedded function and the user's custom logic.
This issue can potentially prevent successful design completion; therefore it bears closer examination.
One of the biggest potential benefits of ESPs is that they can dramatically increase system performance over a design built using multiple discrete devices. This increase comes partially from eliminating on-chip/off-chip delays, but more significantly from simply providing wider signal paths between functions. For example, there can easily be 300 to 500 signals crossing the boundary between embedded functions and the user programmable logic in a typical ESP device. While some large general-purpose PLDs have nearly this many I/Os, none have even half that many along one side of the die.
To make matters more complex, some programmable logic technologies, particularly SRAM, have limited interconnect resources that can create routing bottlenecks even in general-purpose devices with the routing distributed to all four sides of the device. In many cases, design engineers have to contend with changing the layout of their printed circuit boards to be able to successfully place and route their FPGA design, as the place-and-route software re-routes I/O pin signals during design iterations. That approach simply isn't possible when the routing is completely internal to the device, as it is for signals crossing the embedded function/programmable logic boundary in an ESP device.
Figure 3 - The right stuff
Choosing the right programmable technology is often one of the key decisions made during the ESP-based design process.
Other technologies, such as Vialink programmable metal, allow large amounts of interconnect resources to be placed inside of ESP devices, reducing or even eliminating the ESP boundary routing problem. Therefore, choosing the right programmable technology for an ESP-based design can be one of the most important decisions made during the design development process (see Figure 3).
Timing and verification
After successfully placing and routing the design, the next step in the design flow is timing simulation. As with functional simulation, the process should go faster than for general-purpose programmable logic devices, as the device manufacturer should have already verified the embedded function's operational and timing characteristics and the embedded function is often the most complex element in the design.
In some cases, though, the embedded function in ESP devices is a hardware platform that must be configured by the design engineer - either through a set of PROM-based switches, or through soft IP. These configurable embedded functions add a degree of flexibility for the design engineer, but also increase the need to perform more thorough timing verification. The case of PROM-based switches is more straightforward with respect to timing issues, since the manufacturer will have tested at least all of the most likely switch settings. However, it's possible that the user will choose a combination that hasn't been specifically tested by the manufacturer, since even 100 PROM bits create 2^100 possible switch settings.
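To get a feel for why exhaustive testing of switch settings is out of reach, the short sketch below counts the settings for 100 independent one-bit switches and estimates how long it would take to try them all. The test rate of one billion settings per second is a purely hypothetical figure chosen for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def switch_settings(prom_bits):
    """Number of distinct settings for a bank of independent one-bit switches."""
    return 2 ** prom_bits


def years_to_test_all(prom_bits, tests_per_second):
    """Years needed to step through every setting at a fixed test rate."""
    return switch_settings(prom_bits) / (tests_per_second * SECONDS_PER_YEAR)


print(switch_settings(100))  # 1267650600228229401496703205376
# Hypothetical rate: one billion settings tested per second.
print(f"{years_to_test_all(100, 10**9):.2e} years")
```

Even at that rate the full sweep would take on the order of 10^13 years, which is why a manufacturer can only cover the most likely combinations.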
The case of soft IP is more complex because the critical paths in the design can change depending on how the soft IP is implemented on the embedded hardware platform. However, this case is still typically less complex than evaluating the timing of a soft IP function in general-purpose programmable logic, since embedded functions typically have fewer possible configurations than general-purpose logic does.
As with functional simulation, embedded functions are typically provided with test benches and simulation models which allow their operating characteristics to be thoroughly evaluated for timing simulation purposes.
Since ESP devices internalize many of the signals that would otherwise be easily accessible in a multiple discrete device design, the functional and timing simulation steps are even more critical than for general-purpose programmable logic devices. During simulation, these signals will be visible while they might not be during hardware verification. Due to their high level of integration, ESP devices often support very high complexity designs, which is further cause for careful simulation.
The final straw
The final step in the design development process for ESP devices is hardware verification. Interestingly enough, this step should require the least amount of time and simply offer a validation that each of the earlier steps was performed correctly.
If the design doesn't work correctly in hardware, it's best to go back and try to replicate the problem in simulation - rather than debugging the hardware directly. The degree of integration and the associated internalization of signals will limit an engineer's visibility of the problem through hardware. If, however, it's critical to debug in hardware, then the designer can make temporary design modifications to route internal signals out through I/O pins via the programmable logic array, thus increasing visibility into the design.
Although debugging ESP-based designs in hardware is typically more difficult than debugging discrete device or even FPGA-based designs, the good news is that system-level verification can be easier, since once the ESP is working correctly a substantial portion of the system design might then be complete.
The quality of echo
Following is an actual DSP design example to illustrate the design flow in a practical application, using Quicklogic's (San Jose) QuickDSP family of ESP devices.
Digital network echo cancellation is an important function in voice over IP (VoIP) systems. Canceling acoustic echoes and echoes caused by impedance mismatching via the subscriber line is essential for a VoIP system to maintain toll-quality service. Typically, echo cancellation is done by first comparing the voice data samples (which are encoded by codecs such as ADPCM or CS-CELP) from the sender path and the echo-"free" returning path using the LMS (least mean square) algorithm. Then the system must generate a set of coefficients that determine the characteristics of an adaptive finite impulse response (FIR) filter to mimic the echo. The mimicked echo is used to subtract the original echo from the returning path to the sender, effectively canceling the sender's echo. The quality of echo cancellation is only as good as the ability of the adaptive FIR filter to replicate the echo. The design challenge is building an adaptive filter that will sustain effective echo cancellation for the maximum echo delay in the network.
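The adaptation loop described above can be sketched in software as follows. This is a minimal floating-point illustration of the textbook LMS update, not the QuickDSP implementation; the function name, tap count, step size, and the synthetic echo path (a three-sample delay at half amplitude) are all assumptions made for the example.

```python
import random


def lms_echo_canceller(far_end, near_end, num_taps=8, step=0.05):
    """Adapt an FIR filter so its output mimics the echo of far_end in near_end.

    Returns (residual, weights): the signal left after subtracting the
    mimicked echo, and the final filter coefficients.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps           # recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, near_end):
        history = [x] + history[:-1]
        echo_estimate = sum(w * h for w, h in zip(weights, history))
        error = d - echo_estimate        # returning path minus mimicked echo
        # LMS update: nudge each coefficient along the error gradient.
        weights = [w + step * error * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


# Hypothetical test signal: white noise stands in for the sender's voice,
# and the echo path returns it delayed by 3 samples at half amplitude.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.0, 0.0, 0.5]
near = [sum(c * far[n - k] for k, c in enumerate(echo_path) if n - k >= 0)
        for n in range(len(far))]

residual, weights = lms_echo_canceller(far, near)
print([round(w, 3) for w in weights])
```

In this noiseless setup the coefficients settle close to the echo path itself. The number of taps sets the longest echo delay the filter can mimic, which is why the maximum network delay drives the size of the design.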
The QuickDSP embedded standard products include ten to eighteen embedded hardware arithmetic units along with high-performance programmable logic and configurable dual-port SRAM. The hardware arithmetic units are called embedded computational units, or ECUs. Each ECU contains an 8-bit multiplier, 16-bit adder, and 16-bit register, and supports eight different operating modes including multiply, add, multiply/add, multiply/register, add/register (see Figure 4). A three-bit instruction configures each ECU independently of the other ECUs and can be changed dynamically during normal operation. Data for each ECU can come from or go to the programmable-logic array, the memory array, any of the other ECUs, or any device I/O pins.
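A toy behavioral model can make the ECU description more concrete. Everything beyond the stated widths is an assumption: the opcode values are hypothetical (the real three-bit encoding is not given here), only the five named modes are modeled, operands are treated as unsigned, and results wrap at 16 bits.

```python
from enum import IntEnum

MASK8 = 0xFF
MASK16 = 0xFFFF


class Mode(IntEnum):
    # Hypothetical opcodes; only five of the eight modes are named in the text.
    MULTIPLY = 0
    ADD = 1
    MULTIPLY_ADD = 2
    MULTIPLY_REGISTER = 3
    ADD_REGISTER = 4


class ECU:
    """Toy model: 8-bit multiplier, 16-bit adder, 16-bit register."""

    def __init__(self):
        self.register = 0

    def execute(self, mode, a=0, b=0, c=0):
        if mode in (Mode.MULTIPLY, Mode.MULTIPLY_ADD, Mode.MULTIPLY_REGISTER):
            result = (a & MASK8) * (b & MASK8)       # 8x8 product fits in 16 bits
            if mode == Mode.MULTIPLY_ADD:
                result = (result + (c & MASK16)) & MASK16
        elif mode in (Mode.ADD, Mode.ADD_REGISTER):
            result = ((a & MASK16) + (b & MASK16)) & MASK16
        else:
            raise ValueError(f"mode {mode!r} is not modeled in this sketch")
        if mode in (Mode.MULTIPLY_REGISTER, Mode.ADD_REGISTER):
            self.register = result                   # assumed: result is latched
        return result


# The mode can change on every call, mirroring a dynamically changed instruction.
ecu = ECU()
acc = 0
for coeff, sample in zip([1, 2, 3], [4, 5, 6]):
    acc = ecu.execute(Mode.MULTIPLY_ADD, coeff, sample, acc)
print(acc)  # 32
```

The multiply/add loop at the end is the basic operation an FIR filter tap performs, which is why these units suit filter designs.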
In our example, the ECUs become the key elements for multiplying the echo cancellation adaptive FIR filter coefficients. The filter itself can be built as a hybrid of the ECUs and logic implemented in the programmable logic array in the devices, while the filter coefficients are stored in the on-chip SRAM. Although the design is complex, the software development tools incorporate a DSP function generator that automatically generates the necessary logic in the form of Verilog or VHDL source code, eliminating the need to design the filter manually.
Implementing the design
As with all complex designs, the first step in our example is to partition the design into two basic modules - one targeted at the embedded function and the other at the general-purpose programmable logic array. Here we have an especially interesting case, as the module that we have targeted at the embedded function (the adaptive FIR filter) will also require programmable logic and memory. While this fact makes our design partition a little more complicated than it might otherwise be, it doesn't create additional problems in the design flow.
Figure 4 - The architecture within
The expanded ECU contains an 8-bit multiplier, 16-bit adder, and 16-bit register, and supports eight different operating modes.
Since the adaptive FIR filter module is somewhat of a special case, we'll focus on designing it first. Using a software tool such as Matlab or QuickFilter, the designer will specify the filter type and pass-band characteristics. Then the tool will generate the appropriate filter coefficients and show the response of the filter in the frequency and time domains.
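As a rough idea of what such a tool does behind the scenes, the sketch below turns a filter specification into coefficients and then evaluates the frequency response. It uses a generic windowed-sinc low-pass design with a Hamming window, chosen here only for illustration; it is not necessarily the method Matlab or QuickFilter applies, and the function names, the 31-tap length, and the cutoff are assumptions.

```python
import cmath
import math


def design_lowpass(num_taps, cutoff):
    """Windowed-sinc low-pass FIR design.

    cutoff is the normalized cutoff in cycles per sample (0 < cutoff < 0.5).
    """
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - center
        if t == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))  # Hamming
        taps.append(ideal * window)
    gain = sum(taps)
    return [tap / gain for tap in taps]   # normalize to unity gain at DC


def magnitude_response(taps, freq):
    """Magnitude of the frequency response at freq (cycles per sample)."""
    return abs(sum(tap * cmath.exp(-2j * math.pi * freq * n)
                   for n, tap in enumerate(taps)))


taps = design_lowpass(num_taps=31, cutoff=0.1)
for f in (0.0, 0.05, 0.1, 0.2, 0.4):
    print(f"{f:.2f} cycles/sample: {magnitude_response(taps, f):.4f}")
```

The printed magnitudes should sit near 1 in the pass band and fall toward 0 in the stop band, the same kind of check a designer would make visually on the tool's response plots.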
After the designer is satisfied that the adaptive filter is working correctly, work begins on the modules in the other partition - those based purely on programmable logic and memory. Once all modules have been defined and drawn (in the case of schematics) or coded (in the case of HDL-based design entry), they can be functionally simulated. This cycle will be iterated until the designer believes the entire design to be working correctly.
Once proper functionality has been verified for all of the modules, they can be combined into a single top-level design and synthesized. Any synthesis errors or warnings should be addressed and then the designer can move on to place and route.
During the place-and-route phase, the designer should evaluate any potential routing congestion caused by signals crossing the ECU/programmable logic and ECU/embedded memory boundaries.
After successfully placing and routing the design, the designer should thoroughly evaluate the timing characteristics of the design through careful timing simulation. The timing of signals passing through the ECUs will depend on how the filter was implemented by the software and can change during the filter's operation. Therefore, the designer should be careful to simulate all cycles of the filter's operation.
The final step in completing our example design is hardware verification. Again, if all of the other steps were executed carefully, this step should be straightforward and the device should begin working correctly on power-up. If, however, the design doesn't work correctly, the best debugging procedure is to go back to the simulation steps to identify the problem.
Reducing the load
ESPs combine embedded functions, programmable logic, and memory on a single device to provide engineers with a high-performance and highly integrated design solution. The level of complexity of designs targeted at these devices causes the design development process to be slightly more complex than that for general-purpose programmable logic devices, with more emphasis on debugging in simulation than in hardware. However, designs created carefully can actually be completed more quickly than similar designs in pure programmable logic, as the manufacturers of ESP-type devices have already implemented and verified the embedded functions. This fact reduces the overall design load on the system-design engineer and allows them to create very complex designs in the shortest possible time.
Wai-Leng Lim is a member of Quicklogic's DSP Engineering Group. His current focus is planning technical strategy for DSP embedded standard products. His background is in FPGA design flow and applications.
To voice an opinion on this or any other article in Integrated System Design, please e-mail your comments to sdean@cmp.com.
Copyright © 2000 Integrated System Design Magazine